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Abstract. We present an analysis of the non-Gaussianity in the 
distribution of Lya forest lines in the QSO absorption spectra. 
Statistical tests performed on this data indicate that there may 
be large scale structure even though the power spectrum of the 
Lya line distribution on large scales is found to be flat. It is 
apparent that higher (than two) order statistics are crucial in 
quantifying the clustering behavior of Lya clouds. 

A method of detecting the spectrum of cumulants of the dis- 
crete wavelet transform (DWT) coefficients is developed. Be- 
cause the basis of the DWT is compactly supported, the DWT 
is not subject to the central limit theorem. Cumulants of any 
order can be quickly computed and serve as a way of detecting 
non-Gaussianity. 

Using this technique on three independent data sets of Lya 
forests, we find that the distribution of Lya forests does show 
non-Gaussian behavior on scales from 5 to 10 h _1 Mpc with 
confidence level larger than 95%. Two data sets available on 
large scales are found to be non-Gaussian on even larger scales. 
These techniques are effective in discriminating among models 
of the Lya forest formation, which are degenerate at second 
and lower order statistics. 

Key words: cosmology: large-scale structure of Universe - 
quasars: absorption lines 



1. Introduction 

Lya absorption line forests in QSO spectra come from inter- 
vening absorbers, or clouds, with neutral hydrogen column 
densities ranging from about 10 12 to 10 17 cm -2 at high red- 
shifts. Since the size of the Lya clouds at high red-shift is as 
large as 100 - 200 h _1 Kpc, and their velocity dispersion is 
as low as ~ 100 km s _1 (Bechtold et al. 1994, Dinshaw et al. 
1995, Fang et al. 1996), it is generally believed that the Lya 
forests are due to the absorption of pre-collapsed clouds in the 
density field of the universe. Lya clouds are probably fair trac- 
ers of the cosmic density field on large scales and therefore, 
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the clustering behavior of the Lya clouds should be useful for 
testing models of structure formation of the universe. 

More importantly, unlike other high redshift objects, Lya 
forests do not show significant power in their two-point corre- 
lation functions. Aside from very small scales Av < 300 km/s, 
all results drawn from the two-point correlation function of the 
Lya absorption lines have failed to detect clustering (Webb, 
1987, Weymann 1993, Hu et al. 1995, Cristiani et al. 1997). 
The power spectrum of the 1 -D spatial distribution of the Lya 
absorbers is found to be flat on scales in velocity space of ~ 600 
to 30,000 km s" 1 (Pando & Fang 1998). This result indicates 
that the distribution of the Lya clouds may still be in the linear 
or quasilinear evolutionary stages on scales larger than a few 
h _1 Mpc. Indeed, it is found that simulations of popular mod- 
els using the linear or log-normal approximation fit well with 
the second order statistical properties of Lya forests (Bi, Ge & 
Fang 1995, Bi & Davidson 1997). Therefore, the Lya clouds 
may contain information of cosmic clustering in the linear or 
quasilinear evolutionary stages. 

It is known that even though the evolution of the power 
spectrum during the quasi-linear regime does not significantly 
differ from the linear regime, the density perturbations on dif- 
ferent scales will no longer evolve mutually independently be- 
cause of the power transfer of perturbations via mode coupling. 
For popular models, like the cold dark matter model, the mode- 
mode coupling of the quasi-linear evolution leads to a power 
transfer from large scales to small ones (Suto & Sasaki 1991). 
Numerical studies show that the power transfer is already sig- 
nificant on scales of about 50 h _1 Mpc at redshift ~ 2 (Jing et 
al. 1995). Thus, there should exist non-Gaussianity on scales of 
a few 10 h _1 Mpc which is the "remnant" of the mode-mode 
coupling of the quasi-linear evolution. 

This theory is supported by works based on methods other 
than the two-point correlation function. For instance, the dis- 
tribution of nearest neighbor Lya line intervals is found to 
be definitely different from a Poisson process (Duncan, Os- 
triker, & Bajtlik 1989; Liu and Jones 1990). A study using the 
Kolmogorov-Smirnoff (K-S) statistic, finds that Lya absorbers 
show a deviation from a uniform random distribution at the 
~ 3(7 significance level (Fang, 1991). Some observations also 
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indicate the existence of <~ 10 Mpc void (Dobrzycki & Bech- 
told 1991), and deviation from uniform distribution on larger 
scales (Crotts 1987, 1989.) However, this individual structure 
cannot be used for a statistical analysis. Using a method based 
on cluster identification, many structures have been systemati- 
cally identified and formed into an ensemble. It is found that the 
abundance of the identified "clusters" with respect to the rich- 
ness are significantly different from a Gaussian process (Pando 
& Fang 1996, hereafter PF). Recently, we have also found that 
the Lya forest line distribution shows significant scale-scale 
correlations. As a consequence models which predict a Gaus- 
sian process for the evolution of the Lya clouds are ruled out, 
and the halos hosting the clouds must have gone through a "his- 
tory" dependent merging process during their formation (Pando 
et al. 1998.) 

In this paper, we will continue to develop the description of 
the non-Gaussianity of the Lya line distribution. The emphasis 
of this paper will be to detect the non-Gaussian spectrum, and 
to show its ability to discriminate among models of Lya cloud 
formation which are degenerate at second order. 

In §2, we will describe the observed and simulated samples 
of Lya forests, and the problems related to their large scale 
structure detection. In §3, the DWT technique of non-Gaussian 
spectrum detection will be discussed. The results of this analy- 
sis for real and simulated samples are discussed in §4. We will 
show that the distributions of Lya forest lines are significantly 
different from Gaussian distributions. Additionally, we show 
that the non-Gaussian spectrum is a powerful tool for distin- 
guishing between models. 



2. Lya samples and problems 

In PF, we looked at two popular data sets of Lya forests. The 
first compiled by Lu, Wolfe and Turnshek (1991, hereafter 
LWT) contains <~ 950 lines from the spectra of 38 QSO that ex- 
hibit neither broad absorption lines nor metal line systems. The 
second set is from Bechtold (1994, hereafter JB), which con- 
tains ~ 2800 lines from 78 QSO's spectra, in which 34 high 
red-shift QSOs were observed at moderate resolution. In this 
paper, we augment those data sets with two observations using 
the Keck telescope: 1) Hu et al. (1995, hereafter HKCSR) ob- 
served 4 QSO's with a total of 1056 lines and column density 
in the range N H i > 2 x 10 12 cm~ 2 at extremely high S/N; 
2) Kirkman and Tytler (1997, hereafter KT) obtained the high- 
est quality spectra published to date from QSO HS 1946+7658 
with 466 Lya forest lines. A typical sample is shown in Figure 
1, which is a 1-D histogram of the spatial distribution of Lya 
absorption lines of QSO-0142. 

It is well known that the number density of the Lya absorp- 
tion lines increases with red-shift. The number density of lines 
with rest equivalent width W greater than a threshold Wth can 
approximately be described as 
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Fig. 1. The Lya forest line distribution of sample Q0142 and a 
typical realization of its randomization. 



where (dN/dz) a is the number density extrapolated to zero 
red-shift, and 7 the index of evolution. LWT finds that 
(dN/dz) ^ 3 and 7 = 2.75±0.29 for lines with W > W th = 

0. 36A. KT finds 7 = 2.6 while JB finds that 7 = 1.89 ± 0.28 
for Wth > 0.32A and 7 = 1.32 ± 0.24 for W th > 0.16A. 

Like other Lya forest data, these data sets have failed to re- 
veal structures in their distribution on scales Ai; > 300 km/s 
when subjected to a two point correlation analysis (Hu et al. 
1995.) Using the discrete wavelet transform spectrum estima- 
tor, the Fourier power spectrum of the 1 -dimensional (1-D) spa- 
tial distribution of these data is found to be almost flat on scales 
from 2 to about 100 h" 1 Mpc ( Pando & Fang 1998). For com- 
parison, Figure 1 also shows a randomized sample which is 
produced by a random shifting of the lines of QSO-0142. Ob- 
viously, one cannot simply distinguish the real sample and its 
random counterpart by inspection. In fact, when analyzed by 
second order statistics, the real data are still indistinguishable 
from randomized samples. 

These results indicate that 2nd order statistical techniques, 

1. e. the two-point correlation function and the power spectrum, 
are not even qualitatively sufficient to describe the clustering 
features of these samples. Higher order measures are not a cor- 
rection to lower order descriptions, but crucial in describing the 
Lya forest traced matter field. 

This conclusion is strengthened by studying simulated 
samples. Typically, simulated density fields for pre-collapsed 
clouds are generated as perturbations with a linear or linear log- 
normal spectrum given by models such as the cold dark matter 
model (SCDM), the cold plus hot dark matter model (CHDM), 
and the low density flat cold dark matter model (LCDM). The 
baryonic matter distribution is then produced by assuming that 
the baryonic matter traces the dark matter distribution on scales 
larger than the Jeans length of the baryonic gas, but is smooth 
over structures on scales less than the Jeans length. The simu- 
lated Lya absorption spectrum can be calculated as the absorp- 
tion of neutral hydrogen in the baryonic gas. The effects of the 
observational instrumental point-spread-function, along with 
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Poisson and background noises can also be simulated properly 
(Bi, Ge & Fang 1995, hereafter BGF; Bi & Davidson 1997). 

Within a reasonable set of parameters, the simulated sam- 
ples are found to fit with observational measurements such as 
the number density of Lya clouds, the distribution of equiva- 
lent widths, the red-shift-dependence of the width distributions 
and clustering. Regardless the details of the simulation, these 
samples should contain structures because the effects of grav- 
itational collapse have been considered, and baryonic matter 
does not distribute uniformly random, but traces the structure 
of dark matter. However, as is the case with observations, the 
simulated samples show no power in their two-point correla- 
tions (see Figure. 11 of BGF), and their 1-D spectra are rather 
flat on scales less than 100 h _1 Mpc. These results clearly show 
that higher order measures are necessary in describing the sta- 
tistical features of the 1-D distributions of the Lya clouds. 



3. Higher Order Statistics via the Wavelet Coefficients 

3.1. DWT decomposition and 2nd Order Statistics 

It is generally believed that the cosmic mass (or number) den- 
sity distribution p(x) can be mathematically treated as a ho- 
mogeneous random field. It is often more convenient to ex- 
press p(x) in terms of its Fourier transform, p(k), which is the 
Fourier coefficient of p(x). 

One reason for expressing the density distribution in terms 
of its Fourier transform is that for Gaussian random fields all 
the statistical features of p(x) can be completely described by 
the amplitude of the Fourier coefficients. In this case, the phase 
of p(k) is not important and the power spectrum of the per- 
turbations, ^(fc)! 2 , or equivalently, the two-point correlation 
function, are all that is necessary to describe the statistical be- 
havior of the density distribution. However, if the field p{x) is 
non-Gaussian, then in order to have a full statistical descrip- 
tion of the field the phases of the Fourier coefficients p(k) are 
essential. 

As is well known, it is difficult, even practically impossible, 
to find information about the phases of the Fourier coefficient 
as soon as there is some computational noise. The lack of in- 
formation about the phases makes the p(k) description incom- 
plete: we might know the scales k of the structures, but noth- 
ing about the positions of the considered structures. A possi- 
ble way of simultaneously describing the scale and position of 
structures is provided by the discrete wavelet transform (DWT) 
(Daubechies 1992, Meyer 1993.) The DWT analysis has suc- 
cessfully been applied to the problems of turbulence (Yamada 
& Ohkitani 1991; Farge 1992) and high energy physics (Huang 
et al. 1996.) Our previous studies also show that the DWT anal- 
ysis is useful for large scale structure study (Pando & Fang 
1996, Fang & Pando 1997, 1998; Pando et al. 1998, Fang, Deng 
& Xia, 1998, Xu, Fang & Wu, 1998.) 



Let's consider a 1-D mass density contrast 6(x) = [p(x) — 
p]/p, which covers a spatial range < x < L. The expansion 
of the field p(x) in terms of the DWT basis is given by 



5(x 



oo 2 j -1 



(2) 



j=0 1=0 



where ipj.i(x), j = 0,1,..., I = 0...2-7 — 1 are the basis of 
the DWT. The DWT basis are orthogonal and complete. The 
wavelet function coefficient (WFC), Sjj, is computed by 



5j,i = / S(x)ip h i(x)dx. 



(3) 



The wavelet transform basis functions ipjj(x) are gener- 
ated from the basic wavelet ip(x/L) by a dilation, 2 J , and a 
translation I, i.e. 



2-' 



1/2 



(4) 



The basic wavelet xp is designed to be continuous, admissi- 
ble and localized. Unlike the Fourier basis cxp(— i2Trnx/L), 
which are non-local in physical space, the wavelet basis ipjj (x) 
are localized in both physical space and Fourier (scale) space. 
In physical space, ipj,i(x) is centered at position lL/2\ and in 
Fourier space, it is centered at wavenumber 2ir x 2 J / L. There- 
fore, the WFCs, Sj t i, have two subscripts j and I, which de- 
scribe, respectively, the scale and position of the density per- 
turbations. 

A clearer picture of how the transforms work can be seen 
in the phase space (x, k). A complete, orthogonal basis set re- 
solves the whole phase space into "elements" of size Ax and 
Ak. Each mode corresponds to elements in the phase space. 
For the Fourier transform, this corresponds to elements of size 
Afc = and Ax = oo. For the wavelet transform, both Afc and 
Ax are finite, and the corresponding area of the element is as 
small as AxAk = 2tt. That is, the DWT is able to resolve an 
arbitrary function simultaneously in terms of x and k up to the 
limit of the uncertainty principle. The DWT decomposes the 
density fluctuation field S(x) into domains j, I in phase space, 
and for each mode, the corresponding area in the phase space 
is as small as that allowed by the uncertainty principle. 

The WFC Sj.i and its intensity |<5j,i| 2 describe, respectively, 
the fluctuation of the density and its power on scale L/2 J at 
position IL/2 J . As with the Fourier basis, Parseval's theorem 
holds for the DWT basis. It is (Fang & Pando 1997, Pando & 
Fang, 1998) 



oo 2 J -1 



- / \8(x)fdx = J2Y, 
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(5) 



It is possible to define the power spectrum of the density per- 
turbation on scale j by the variance of the WFCs as 



2 J -1 



(6) 
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where 



v-i 



1=0 



It has been shown that the DWT power spectrum eq.(6) can be 
converted to the Fourier power spectrum, i.e. in terms of second 
order statistical description the DWT and Fourier transform are 
equivalent. 

3.2. One-point distribution ofWFCs and non-Gaussianity 

The cosmic density field is usually assumed to be ergodic: the 
average over an ensemble is equal to the spatial average taken 
over one realization. This is the so-called "fair sample hypoth- 
esis" (Peebles 1980). A homogeneous Gaussian field with con- 
tinuous spectrum is certainly ergodic (Adler 1981). In some 
non-Gaussian cases, such as homogeneous and isotropic turbu- 
lence (Vanmarke, 1983), ergodicity also approximately holds. 
Roughly, the ergodic hypothesis is reasonable if spatial corre- 
lations are decreasing sufficiently rapidly with increasing sep- 
aration. In this case, the volumes separated by large distances 
are approximately statistically independent, and can be treated 
as independent realizations. Note that the ipj,i{x) are orthogo- 
nal with respect to the position index I, and therefore, for an 
ergodic field, the 23 WFCs, 6 jt i, I = 0, 1...2-? - 1, at a given 
j should be statistically independent. Thus the WFCs Sjj at 
fixed j and different I can be treated as independent measures 
of the density fluctuation field. The 2 J WFCs, 6j i, from one 
realization of 6(x) can be employed as a statistical ensemble. 
In this way, when the fair sample hypothesis holds, an average 
of the ensemble can be well estimated by averaging over I, i.e. 

(5j t i) ~ (1/2- 7 ') X^o* wnere (••■) denotes the ensemble 
average. 

Consequently, the distribution of the Sjj is actually the one- 
point distribution of the WFCs at a given scale j. The non- 
Gaussianity of the density field S(x) can directly be measured 
by the deviation of the one-point distribution from a Gaussian 
distribution. For this purpose, we can calculate the cumulant 
moments defined by (Carruthers 1991; Carruthers, Eggers & 
Sarcevic 1991) 



For Gaussian fields all the cumulant moments higher than 
order 2 are zero. Thus one can measure the non-Gaussianity of 
(7) the density field 5(x) by /" with n > 2. Analogous to ij be- 
ing called the DWT power spectrum, we will call 7" the DWT 
spectrum of n-th cumulant. The cumulant measures i| and ij 
are related to the better known skewness and kurtosis, respec- 
tively. Thus, the non-Gaussianity of S(x) can be detected by 
the DWT skewness and kurtosis spectra defined as 



Mi 
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where 
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(8) 
(9) 
(10) 
(11) 

(12) 



From eqs.(6), (7) and (8), one sees that the second order cumu- 
lant moment is the DWT power spectrum on the scale j, i.e. 
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(13) 



Sj = 



(7?)3/2V 



(W 2 3 



(14) 



(15) 



Sj and Kj are basic statistical measures employed in this paper. 

For comparison, the definitions of the "standard" skewness 
S and kurtosis K for a 1-D distribution S(x) covering on N 
bins are given as follows 



i N 



and 



i N 



(16) 



(17) 



i=l 



where the variance a 2 is given by 
i N 

1=1 

Obviously, no scale information is given by S and K. 



(18) 



3.3. Central limit theorem and the DWT analysis 

It is well known that not all one-point distributions can detect 
non-Gaussianities . This is due to the constraints imposed by 
the central limit theorem. For instance, if the universe con- 
sists of a large number of dense clumps with a non-Gaussian 
probability distribution function (PDF), the one-point distribu- 
tions of the real and imaginary components of each individual 
Fourier mode are still Gaussian due to the central limit theo- 
rem (Ivanonv & Leonenko 1989). Further, even when the non- 
Gaussian clumps are correlated the central limit theorem still 
holds if the two-point correlation function of the clumps ap- 
proaches zero sufficiently fast (Fan & Bardeen 1995). For these 
reasons, the one-point distribution function of Fourier modes is 
not sensitive enough to detect deviations from Gaussian behav- 
ior. Even for samples with strong non-linear evolution, the one 
point distribution function of Fourier modes is found to be con- 
sistent with a Gaussian distribution (Suginohara & Suto 1991). 
It should be pointed out that the inefficiency of the Fourier 
mode one-point distribution in detecting non-Gaussianity is not 
because Fourier transform loses information about the distribu- 
tion f(x). The Fourier coefficients contain all the information 
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on non-Gaussianity, but the information is mainly contained in 
the phases of the Fourier coefficients. As mentioned in §3.1, 
it is very difficult to detect the distribution of the phases of 
Fourier coefficients. 

On the other hand, the wavelet coefficients are not subject 
to the central limit theorem. In this respect, the DWT analysis 
is similar to the count in cell (CIC) method (Hamilton 1985; 
Alimi, Blanchard & Schaeffer 1990; Gaztanaga & Yokoyama 
1993; Bouchet et al. 1993; Kofman et al. 1994; Gaztanaga & 
Frieman 1994). The CIC is not subject to the central limit theo- 
rem as it based on spatially localized window functions (Adler 
1981). The DWT's basis are also localized. If the scale of a 
clump in the universe is d, eq.(3) shows that the WFCs, Sjj, 
with j = \og 2 (L/d), is determined only by the density field in 
a range containing no more than one clump. That is, for scale 
j the wavelet coefficients are not given by a superposition of a 
large number of clumps, but are determined by at most one of 
them. Therefore, it avoids the constraints imposed by the cen- 
tral limit theorem. 

This point can also be shown from the orthonormal basis 
being used for the expansion of the distribution 5(x). A basic 
condition of the central limit theorem is that the modulus of 
the basis be less than C / \[L, where C is a constant (Ivanonv 
& Leonenko 1989). Obviously, all Fourier-related orthonormal 
basis satisfy this condition because (l/\/Z)| sinfc:r| < C/^/L 
and (l/Vi) | cos kx\ < Cj VZ, where C is independent of co- 
ordinates in both physical space x and scale space k. On the 
other hand, the normalized wavelets functions have [eq.(4)] 

~ (j) 0(1)- (19) 

Because the magnitude of the basic wavelet ip(x) is of order 
1. The condition \4>j,i( x )\ < C/^/~L will no longer hold for 
a constant C independent of scale variable j. Hence, the one- 
point-distribution of wavelet coefficients on scale j should be a 
good estimator for the PDF of a distribution. 

4. Skewness and kurtosis spectra of Lye* forests 

4.1. Data preparation 

The observational QSO Lya forests data have a complex ge- 
ometry. By this we mean that different forests cover different 
spatial ranges, and no one of the forests distributes on the entire 
range {D mini D max ). At the very least, a complicated weight- 
ing scheme is needed in order to form an ensemble. The DWT 
provides a way around this problem. Because wavelets are lo- 
calized, the wavelet coefficients are only determined by the lo- 
cal distribution. If a forest sample lies in range (D\, D 2 ), one 
can extend it to (D min , D max ) by adding zeros to the data in 
ranges (D min , Di) and (D 2 , D max ). The wavelet coefficients 
in the interval (D\, D 2 ) will not be affected by the addition of 
zero in (D min ,D\) and (D 2 , D max ) . Any statistics can then be 
computed by the ensemble of wavelet coefficients, Sjj, by sim- 
ply dropping all wavelet coefficients with I in the added zero 
ranges. 



400.0 
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Fig. 2. Frequency distributions of Kg from 5000 random sam- 
ples. It can be clearly seen that this distribution is non- 
Gaussian. 

More precisely, the effect of zero padding on the wavelet 
coefficients is described by the so-called "influence cone" 
which consists of the spatial support of all dilated wavelet func- 
tions. For instance, if ipji{x) is localized in the space interval 
Ax for j = 0, the influence cone centered at Xq is given by 
x € [x a -(Ax/23 +1 ),x Q + (Ax/23 +1 )} (Farge 1992). Wavelet 
coefficients corresponding to positions outside (Ax/2 :,+1 ) will 
not corrupt information within the influence cone. 

The LWT and JB samples cover a red-shift range of 1 .7 to 

4.1, i.e. the spatial range in comoving distance is from about 
D mm =2,300 h^Mpc to D rnax =3,300 h^Mpc, if q = 1/2 
and h= iJo/100 km s _1 Mpc -1 . To eliminate the proximity 
effect, all lines with z > z em — 0.15 are deleted from our sam- 
ples. These two samples are extended to (D m i n , D max ) as de- 
scribed above, and binned into in 512 bins with comoving size 
~ 2.5 h _1 Mpc, which is about the scale where the effect of 
line blending occurs. If a line is not located at the center of 
a bin, it is separated into the two neighboring bins, weighted 
according to the distance to each of the centers. 

The HKCSR and KT samples (HKCSR+KT), can be 
treated the same way. Since HKCSR and KT cover a 
smaller red-shift range from about 2.4 to 3.1, the forests of 
HKCSR+KT are extended to 128 bins of comoving size <~ 2.5 
h _1 Mpc. Thus it is possible to compare all 3 sets for scales 
down to j = 6. 

4.2. Skewness and kurtosis spectra of real data 

As opposed to 6(x) discussed in §3, real data provide only his- 
tograms of Lya line distributions, i.e. the data are not a contin- 
uous function of x, but a point process in x space. In this case, 
the WFCs Sjj will not be calculated by an integral like eq.(3), 
but by the wavelet transformation matrix (Press et al. 1992.) 

It is well known that point processes generally are non- 
Gaussian. For instance, if the Lya clouds do not distribute 
structurally, then 8(x) should be a 1-D Poisson process. How- 
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► HKCSR+KT 
- RSEM 



► HKCSR+KT 
■ RSEM 



5.0 6.0 7.0 



Fig. 3. Skewness spectrum for LWT (W > 0.36A), JB (W > 
0.32A) and HKCSR+KT samples. The red-shift evolution 
model (RSEM) given by eq. (1), and the 95% confidence limits 
are also shown. The physical scale is related to j by 2.5 x 2 10- - 7 
h- 1 Mpc. 



ever, Poisson processes are non-Gaussian. Obviously, this non- 
Gaussianity is not what we are seeking to measure because it is 
not the result of non-linear evolution or physical processes re- 
lated to structure formation. Therefore, we should carefully dis- 
tinguish the clustering related non-Gaussianity with that from 
sampling and binning. For this purpose, we generate 5000 ran- 
dom realizations in which the mean number density follows 
eq.(l). Each realization is treated in the same way as the real 
data sets. From these random samples we can produce fre- 
quency distributions for each statistic, say Kg (Figure 3). These 
frequency distributions are fair estimators of the underlying 
probability distribution. It can clearly be seen from Figure 2 
that the random data are non-Gaussian. 

Using these frequency distributions, the confidence levels 
for the mean values of the random data can be estimated. These 
are then to be compared with the real data. 

The skewness and kurtosis spectra of the LWT (W> 0.36 
A), JB (W> 0.32 A) and HKCST+KT (b > 78 km s" 1 ) data 
are shown in Figures 3 and 4, respectively. The skewness and 
kurtosis spectra of random data and the 95% confidence lev- 
els are also shown in these figures. Figures 3 and 4 show that 
the skewness and the kurtosis spectra of the LWT (W> 0.36A) 
and JB (W> 0.32A) are almost the same. The kurtosis spectra 
are > 95%, significantly different from the random samples on 
all scales, and, on scales j= 5, the skewness spectra are also 
> 95% and significantly different from the random samples. 
The Keck data, i.e. HKCSR+KT, have qualitatively the same 
non-Gaussian behavior, especially at Kg and Kg, where the 
HKCSR+KT data are the same as that of LWT and JB. The 
HKCSR+KT gives lower values for K 6 and K 7 than the LWT 
and JB, while it gives higher Sg and Sg than LWT and JB. With 
the current data it is not possible to determine whether these 
differences are due to the higher resolution of the HKCSR+KT 
data. It is important to realize that these observational data sets 



Fig. 4. Kurtosis spectrum for LWT (W > 0.36A), JB (W > 
0.32A) and HKCSR+KT samples. The red-shift evolution 
model (RSEM) given by eq. (1), and the 95% confidence limits 
are also shown. 



j = 6 










1 .0 -1 .0 



1.0 -1.0 



Fig. 5. Histogram of one-point distribution of wavelet coeffi- 
cients for JB (W> 0.32A). At each scale j, the Gaussian distri- 
bution (dashed line) has the same variance and normalization 
as the wavelet coefficient distribution. 



were compiled by different groups, on different instruments, 
and as much as 6 years apart. These data sets are very indepen- 
dent and makes more convincing the case for the existence of 
the non-Gaussianity at least on scales of 5 - 10 h _1 Mpc, and 
confirms that the features shown in the non-Gaussian spectrum 
are intrinsic features of the density field traced by Lya forests. 

We can directly describe the non-Gaussianity by the one- 
point distributions of the WFCs. Figure 5 plots the one-point 
distributions of Sj.i for the JB sample with > 0.32A. For each 
scale j, the corresponding Gaussian distribution is plotted such 
that it has the same normalization and variance as the one-point 
distribution. This figure clearly shows that all the distributions 
on scales j > 4 (or less than about 80 h _1 Mpc) are signifi- 
cantly non-Gaussian. These distributions are also asymmetric, 
with fewer positive wavelet coefficients. 
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width W > 0.32A. The error bars in Figure 6 are given by the 
distribution of 20 realizations for each model. 

Figure 6 shows that even though the BGF simulation is 
based on the linear power spectrum of the density perturba- 
tions, the Lya distribution is non-Gaussian. This is because the 
selection of peaks from the density field is a non-Gaussian pro- 
cesses. More important, Figure 7 clearly shows the significant 
difference among the three models. With 95% confidence, the 
Kj amplitudes of all three models are different on all scales. 
The Kj of the SCDM model is significantly larger than zero 
for all scales j. Yet, the LCDM model gives much lower Kj for 
all scales j. The non-Gaussian spectrum provides a extremely 
effective method for removing the degeneracy among models. 



Fig. 6. Kurtosis spectrum of BGF samples in the SCDM, 
LCDM, and CHDM models with W > 0.32A. 



This asymmetry is at least partially due to the red-shift- 
dependence of the Lya clouds. The wavelet coefficient Sj-ij 
is mainly determined by the difference of (positive) densities 
at {j, 21} and {j, 21 + 1} (PF). For a clump in red-shift space, 
the density change on the lower red-shift or lower I side con- 
tributes negative wavelet coefficients, while the higher red-shift 
side gives positive wavelet coefficients. If as shown in PF, the 
number of Lya clumps decreases with increasing red-shift, the 
change in clustering amplitudes (wavelet coefficients) on the 
higher red-shift side (positive wavelet coefficients) should be 
less than the lower side (negative wavelet coefficients). That 
is, the number of positive wavelet coefficients will be less than 
negative wavelet coefficients. This is consistent with a small, 
but positive skewness. 

4.3. Removing degeneracy by the non-Gaussian spectrum 

We have shown that cluster identification by a wavelet decom- 
position is a useful tool for discriminating among models of 
Lya clouds (PF). The spectrum of non-Gaussianity can play 
the same role. Namely, Sj and Kj are effective measures for re- 
moving the degeneracy that exists at second order among mod- 
els. 

As an example, we examined the BGF simulated Lya forest 
samples. This simulation shows that two models, SCDM and 
LCDM, are degenerate if only the first (number density) and 
second (variance, or power spectrum) order statistics are con- 
sidered. That is, both SCDM and LCDM give about the same 
predictions for the following features of the Lya forests: a.) the 
number density of Lya lines and its dependencies on red-shift 
and equivalent width; b.) the distribution of equivalent widths 
and its red-shift dependence; c.) the two-point correlation func- 
tion. 

This degeneracy can be removed by the non-Gaussian spec- 
trum. Figure 6 plots the kurtosis spectra for the LCDM, SCDM, 
and CHDM models, in which the Lya lines are chosen with 



5. Discussion and conclusions 

The DWT skewness and kurtosis spectra of real and simulated 
samples of Lya forests has been calculated. Deviations from 
a Gaussian state in these samples are detected for all data on 
scales of 5 to 10 h _1 Mpc with confidence level larger than 
95%. For the two sets which are available on large scales, the 
non-Gaussianity is significant even on scales as large as about 
80 h _1 Mpc. The scale-dependence (or spectrum) for a The 
higher order DWT cumulant is especially effective in distin- 
guishing the real samples from random data, and in discrimi- 
nating among models. 

It is possible for non-Gaussianity to result from systematic 
errors hiding in the original data reduction. For instance, dis- 
crete sampling or binning of a continuous distribution into a 
histogram with bin size AD leads to non-Gaussianity at least 
on scale AD. Many selection effects actually are some kind 
of sampling, and therefore, they will also give rise to non- 
Gaussianity. All these non-Gaussianities are "spurious". 

Fortunately, in most cases these spurious non-Gaussian sig- 
nals are significant only at one special scale. For a Poisson pro- 
cess, it is the scale of the mean distance of nearest neighbor 
clouds. For binning, it is AD. On scales larger than AD, the 
spurious non-Gaussianities will decay out. That is, the scale- 
dependent behavior of sampling and binning is useful in recog- 
nizing spurious non-Gaussianity. In the case of the Lya forests, 
the mean nearest neighbor distance along with the binning 
scale AD are much less than the scales of the detected non- 
Gaussianities, and therefore, the detected non-Gaussianities are 
inherent to the distributions. There is also evidence for intrinsic 
scales in the Lya spectra of magnitude <~ 25 h _1 Mpc (Rauch 
1998). However this contamination can be reduced by studying 
Kj and Sj on scales different from 25 h _1 Mpc. 

Nevertheless, high-resolution samples which can cover the 
spatial range as 1 arge as 80 h _1 Mpc are needed before a final 
determination can be made. Recently a scale-scale correlation 
analysis also shows a non-Gaussianity in the Lya line distribu- 
tion on such large scales (Pando et al. 1998). It seems to support 
our conclusion that cosmic matter underwent a non-Gaussian 
process on quite large scales at quite early times. 

Last but not least, the numerical work of calculating DWT 
higher order moment spectrum is not any more difficult than 
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calculating the second order moments. Generally, the calcu- 
lation of third and higher order correlation functions of large 
scale structure samples is very strenuous work. But the numer- 
ical work involved in calculating the DWT is about the same 
as the FFT, and can even be faster. The FFT requires ~ N Log 
N calculations, while the DWT, using a "pyramid" scheme, re- 
quires only order N calculations (Press et al. 1992). 

This method can easily be generalized to 2- and 3- 
dimensions. The kurtosis and skewness spectrum opens a new 
window for looking at the statistical features of large scale 
structures. It is an important and necessary addition to the ex- 
isting methods of describing the clustering and correlation of 
the cosmic density field, and for discriminating among models 
of structure formation. 
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